Overview

Dataset statistics

Number of variables: 26
Number of observations: 30000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.3 MiB
Average record size in memory: 152.0 B

Variable types

NUM: 15
CAT: 11

Warnings

BILL_AMT2 is highly correlated with BILL_AMT1 and 1 other field [High correlation]
BILL_AMT1 is highly correlated with BILL_AMT2 [High correlation]
BILL_AMT3 is highly correlated with BILL_AMT2 and 1 other field [High correlation]
BILL_AMT4 is highly correlated with BILL_AMT3 and 2 other fields [High correlation]
BILL_AMT5 is highly correlated with BILL_AMT4 and 1 other field [High correlation]
BILL_AMT6 is highly correlated with BILL_AMT4 and 1 other field [High correlation]
PAY_AMT2 is highly skewed (γ1 = 30.45381745) [Skewed]
df_index has unique values [Unique]
ID has unique values [Unique]
BILL_AMT1 has 2008 (6.7%) zeros [Zeros]
BILL_AMT2 has 2506 (8.4%) zeros [Zeros]
BILL_AMT3 has 2870 (9.6%) zeros [Zeros]
BILL_AMT4 has 3195 (10.7%) zeros [Zeros]
BILL_AMT5 has 3506 (11.7%) zeros [Zeros]
BILL_AMT6 has 4020 (13.4%) zeros [Zeros]
PAY_AMT1 has 5249 (17.5%) zeros [Zeros]
PAY_AMT2 has 5396 (18.0%) zeros [Zeros]
PAY_AMT3 has 5968 (19.9%) zeros [Zeros]
PAY_AMT4 has 6408 (21.4%) zeros [Zeros]
PAY_AMT5 has 6703 (22.3%) zeros [Zeros]
PAY_AMT6 has 7173 (23.9%) zeros [Zeros]
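
The high-correlation warnings can be illustrated with a minimal sketch. It assumes the Pearson coefficient and a hypothetical cutoff of 0.9; the report does not state which coefficient or threshold triggered the warnings, and the two short columns below are made-up values, not rows from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


THRESHOLD = 0.9  # hypothetical cutoff, not taken from the report

# Made-up bill amounts for two consecutive months.
bill_a = [0, 390, 780, 2500, 50000]
bill_b = [0, 390, 700, 2400, 48000]

r = pearson(bill_a, bill_b)
is_high = abs(r) > THRESHOLD  # would raise a "High correlation" flag
```

Consecutive monthly bill amounts tend to move together, which is one plausible reason the BILL_AMT columns are flagged against their neighbours.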

Reproduction

Analysis started: 2020-12-07 18:02:08.694807
Analysis finished: 2020-12-07 18:04:45.019794
Duration: 2 minutes and 36.32 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
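
As a quick consistency check, the reported duration follows from the two timestamps. The sketch below uses only the standard library; the variable names are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2020-12-07 18:02:08.694807", FMT)
finished = datetime.strptime("2020-12-07 18:04:45.019794", FMT)

# Elapsed wall-clock time in seconds, split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
summary = f"{int(minutes)} minutes and {seconds:.2f} seconds"
```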

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 30000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15202.1399
Minimum: 1
Maximum: 30203
Zeros: 0
Zeros (%): 0.0%
Memory size: 234.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 1703.95
Q1: 7703.75
Median: 15203.5
Q3: 22703.25
95th percentile: 28703.05
Maximum: 30203
Range: 30202
Interquartile range (IQR): 14999.5
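
The quantile statistics are related by simple arithmetic: the range is maximum minus minimum and the IQR is Q3 minus Q1. The sketch below assumes percentiles are computed with linear interpolation between ranks (the default in pandas' `quantile`); the `sample` list is made up, while the `reported_*` values come from the figures above.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up data
q1 = percentile(sample, 25)
q3 = percentile(sample, 75)
iqr = q3 - q1
value_range = max(sample) - min(sample)

# The same relations applied to the figures reported for df_index.
reported_iqr = 22703.25 - 7703.75  # Q3 - Q1
reported_range = 30203 - 1         # Maximum - Minimum
```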

Descriptive statistics

Standard deviation: 8662.753906
Coefficient of variation (CV): 0.5698377967
Kurtosis: -1.198695871
Mean: 15202.1399
Median Absolute Deviation (MAD): 7500
Skewness: -0.0009415584061
Sum: 456064197
Variance: 75043305.23
Monotonicity: Strictly increasing
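
A minimal sketch of how these descriptive statistics relate to one another. It assumes the sample standard deviation (n - 1 denominator) and plain moment-based skewness and excess kurtosis; the profiling tool may apply bias corrections, so results on real data can differ slightly in the last digits. The function name `describe` and the dictionary keys are illustrative.

```python
import math
import statistics


def describe(values):
    """Return a few descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    # Central moments with an n denominator (no bias correction).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "std": std,
        "variance": std ** 2,
        "cv": std / mean,              # coefficient of variation
        "mad": mad,                    # median absolute deviation
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3,  # excess kurtosis
    }


stats = describe([1, 2, 3, 4, 5])  # made-up data

# Relations checked against the figures reported for df_index.
reported_cv = 8662.753906 / 15202.1399  # std / mean
reported_variance = 8662.753906 ** 2    # std squared
```

The reported kurtosis of about -1.2 for df_index matches the excess kurtosis of a uniform distribution, which fits a nearly evenly spaced index.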
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
2047 | 1 | < 0.1%
23873 | 1 | < 0.1%
29988 | 1 | < 0.1%
25894 | 1 | < 0.1%
27943 | 1 | < 0.1%
5416 | 1 | < 0.1%
7465 | 1 | < 0.1%
1322 | 1 | < 0.1%
3371 | 1 | < 0.1%
13612 | 1 | < 0.1%
Other values (29990) | 29990 | > 99.9%

Minimum 5 values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
30203 | 1 | < 0.1%
30202 | 1 | < 0.1%
30201 | 1 | < 0.1%
30200 | 1 | < 0.1%
30199 | 1 | < 0.1%
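
Tables of common and extreme values like these can be derived from a frequency count. The sketch below uses `collections.Counter` on a short made-up column; it illustrates the idea and is not the profiling tool's own implementation.

```python
from collections import Counter

values = [0, 390, 0, 780, 390, 0, 2500, -200, 0, 390]  # made-up column
counts = Counter(values)
n = len(values)

# Most frequent values as (value, count, frequency in percent).
common = [(v, c, 100 * c / n) for v, c in counts.most_common(2)]

# Extreme values with their counts.
smallest = sorted(counts.items())[:2]
largest = sorted(counts.items(), reverse=True)[:2]
```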
 

ID
Categorical

UNIQUE

Distinct: 30000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 234.4 KiB

Value | Count | Frequency (%)
6185 | 1 | < 0.1%
10133 | 1 | < 0.1%
14008 | 1 | < 0.1%
12237 | 1 | < 0.1%
1988 | 1 | < 0.1%
19367 | 1 | < 0.1%
29173 | 1 | < 0.1%
29561 | 1 | < 0.1%
5931 | 1 | < 0.1%
12092 | 1 | < 0.1%
Other values (29990) | 29990 | > 99.9%

Frequencies of value counts

Unique

Unique: 30000
Unique (%): 100.0%

Histogram of lengths of the category

Length

Max length: 5
Median length: 5
Mean length: 4.6298
Min length: 1
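
The length statistics describe the string form of each category. As a sketch, if the IDs are assumed to be the integers 1 through 30000 written as text (an assumption; the report only shows a sample of values), the same figures fall out directly:

```python
import statistics

# Assumption: IDs are the integers 1..30000 stored as strings.
ids = [str(i) for i in range(1, 30001)]
lengths = [len(s) for s in ids]

length_stats = {
    "max": max(lengths),
    "median": statistics.median(lengths),
    "mean": statistics.fmean(lengths),
    "min": min(lengths),
}
```

Under that assumption the mean length is 138894 / 30000 = 4.6298, which agrees with the value reported above.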

LIMIT_BAL
Real number (ℝ≥0)

Distinct: 81
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 167484.3227
Minimum: 10000
Maximum: 1000000
Zeros: 0
Zeros (%): 0.0%
Memory size: 117.2 KiB

Quantile statistics

Minimum: 10000
5th percentile: 20000
Q1: 50000
Median: 140000
Q3: 240000
95th percentile: 430000
Maximum: 1000000
Range: 990000
Interquartile range (IQR): 190000

Descriptive statistics

Standard deviation: 129747.6616
Coefficient of variation (CV): 0.7746854124
Kurtosis: 0.5362628964
Mean: 167484.3227
Median Absolute Deviation (MAD): 90000
Skewness: 0.9928669605
Sum: 729562384
Variance: 1.683445568e+10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
50000 | 3365 | 11.2%
20000 | 1976 | 6.6%
30000 | 1610 | 5.4%
80000 | 1567 | 5.2%
200000 | 1528 | 5.1%
150000 | 1110 | 3.7%
100000 | 1048 | 3.5%
180000 | 995 | 3.3%
360000 | 881 | 2.9%
60000 | 825 | 2.8%
Other values (71) | 15095 | 50.3%

Minimum 5 values
Value | Count | Frequency (%)
10000 | 493 | 1.6%
16000 | 2 | < 0.1%
20000 | 1976 | 6.6%
30000 | 1610 | 5.4%
40000 | 230 | 0.8%

Maximum 5 values
Value | Count | Frequency (%)
1000000 | 1 | < 0.1%
800000 | 2 | < 0.1%
780000 | 2 | < 0.1%
760000 | 1 | < 0.1%
750000 | 4 | < 0.1%

SEX
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.4 KiB

Value | Count | Frequency (%)
female | 18112 | 60.4%
male | 11888 | 39.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 6
Median length: 6
Mean length: 5.207466667
Min length: 4

EDUCATION
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.4 KiB

Value | Count | Frequency (%)
university | 14030 | 46.8%
graduate school | 10585 | 35.3%
high school | 4917 | 16.4%
other | 468 | 1.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 15
Median length: 11
Mean length: 11.85006667
Min length: 5

MARRIAGE
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.4 KiB

Value | Count | Frequency (%)
2 | 15964 | 53.2%
1 | 13659 | 45.5%
3 | 323 | 1.1%
0 | 54 | 0.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

AGE
Real number (ℝ≥0)

Distinct: 56
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.4855
Minimum: 21
Maximum: 79
Zeros: 0
Zeros (%): 0.0%
Memory size: 117.2 KiB

Quantile statistics

Minimum: 21
5th percentile: 23
Q1: 28
Median: 34
Q3: 41
95th percentile: 53
Maximum: 79
Range: 58
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 9.217904068
Coefficient of variation (CV): 0.2597653709
Kurtosis: 0.04430337824
Mean: 35.4855
Median Absolute Deviation (MAD): 6
Skewness: 0.7322458688
Sum: 1064565
Variance: 84.96975541
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
29 | 1605 | 5.3%
27 | 1477 | 4.9%
28 | 1409 | 4.7%
30 | 1395 | 4.7%
26 | 1256 | 4.2%
31 | 1217 | 4.1%
25 | 1186 | 4.0%
34 | 1162 | 3.9%
32 | 1158 | 3.9%
33 | 1146 | 3.8%
Other values (46) | 16989 | 56.6%

Minimum 5 values
Value | Count | Frequency (%)
21 | 67 | 0.2%
22 | 560 | 1.9%
23 | 931 | 3.1%
24 | 1127 | 3.8%
25 | 1186 | 4.0%

Maximum 5 values
Value | Count | Frequency (%)
79 | 1 | < 0.1%
75 | 3 | < 0.1%
74 | 1 | < 0.1%
73 | 4 | < 0.1%
72 | 3 | < 0.1%

PAY_0
Categorical

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.4 KiB

Value | Count | Frequency (%)
0 | 14737 | 49.1%
-1 | 5686 | 19.0%
1 | 3688 | 12.3%
-2 | 2759 | 9.2%
2 | 2667 | 8.9%
3 | 322 | 1.1%
4 | 76 | 0.3%
5 | 26 | 0.1%
8 | 19 | 0.1%
6 | 11 | < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 2
Median length: 1
Mean length: 1.2815
Min length: 1

PAY_2
Categorical

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.4 KiB

Value | Count | Frequency (%)
0 | 15730 | 52.4%
-1 | 6050 | 20.2%
2 | 3927 | 13.1%
-2 | 3782 | 12.6%
3 | 326 | 1.1%
4 | 99 | 0.3%
1 | 28 | 0.1%
5 | 25 | 0.1%
7 | 20 | 0.1%
6 | 12 | < 0.1%

Frequencies of value counts

Unique

Unique: 1
Unique (%): < 0.1%

Histogram of lengths of the category

Length

Max length: 2
Median length: 1
Mean length: 1.327733333
Min length: 1

PAY_3
Categorical

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.4 KiB

Value | Count | Frequency (%)
0 | 15764 | 52.5%
-1 | 5938 | 19.8%
-2 | 4085 | 13.6%
2 | 3819 | 12.7%
3 | 240 | 0.8%
4 | 76 | 0.3%
7 | 27 | 0.1%
6 | 23 | 0.1%
5 | 21 | 0.1%
1 | 4 | < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 2
Median length: 1
Mean length: 1.3341
Min length: 1

PAY_4
Categorical

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.4 KiB

Value | Count | Frequency (%)
0 | 16455 | 54.9%
-1 | 5687 | 19.0%
-2 | 4348 | 14.5%
2 | 3159 | 10.5%
3 | 180 | 0.6%
4 | 69 | 0.2%
7 | 58 | 0.2%
5 | 35 | 0.1%
6 | 5 | < 0.1%
1 | 2 | < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 2
Median length: 1
Mean length: 1.3345
Min length: 1

PAY_5
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.4 KiB

Value | Count | Frequency (%)
0 | 16947 | 56.5%
-1 | 5539 | 18.5%
-2 | 4546 | 15.2%
2 | 2626 | 8.8%
3 | 178 | 0.6%
4 | 84 | 0.3%
7 | 58 | 0.2%
5 | 17 | 0.1%
6 | 4 | < 0.1%
8 | 1 | < 0.1%

Frequencies of value counts

Unique

Unique: 1
Unique (%): < 0.1%

Histogram of lengths of the category

Length

Max length: 2
Median length: 1
Mean length: 1.336166667
Min length: 1

PAY_6
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.4 KiB

Value | Count | Frequency (%)
0 | 16286 | 54.3%
-1 | 5740 | 19.1%
-2 | 4895 | 16.3%
2 | 2766 | 9.2%
3 | 184 | 0.6%
4 | 49 | 0.2%
7 | 46 | 0.2%
6 | 19 | 0.1%
5 | 13 | < 0.1%
8 | 2 | < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 2
Median length: 1
Mean length: 1.3545
Min length: 1

BILL_AMT1
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 22723
Distinct (%): 75.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51223.3309
Minimum: -165580
Maximum: 964511
Zeros: 2008
Zeros (%): 6.7%
Memory size: 117.2 KiB

Quantile statistics

Minimum: -165580
5th percentile: 0
Q1: 3558.75
Median: 22381.5
Q3: 67091
95th percentile: 201203.05
Maximum: 964511
Range: 1130091
Interquartile range (IQR): 63532.25

Descriptive statistics

Standard deviation: 73635.86058
Coefficient of variation (CV): 1.437545339
Kurtosis: 9.806289341
Mean: 51223.3309
Median Absolute Deviation (MAD): 21800.5
Skewness: 2.663861022
Sum: 1536699927
Variance: 5422239963
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 2008 | 6.7%
390 | 244 | 0.8%
780 | 76 | 0.3%
326 | 72 | 0.2%
316 | 63 | 0.2%
2500 | 59 | 0.2%
396 | 49 | 0.2%
2400 | 39 | 0.1%
416 | 29 | 0.1%
500 | 25 | 0.1%
Other values (22713) | 27336 | 91.1%

Minimum 5 values
Value | Count | Frequency (%)
-165580 | 1 | < 0.1%
-154973 | 1 | < 0.1%
-15308 | 1 | < 0.1%
-14386 | 1 | < 0.1%
-11545 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
964511 | 1 | < 0.1%
746814 | 1 | < 0.1%
653062 | 1 | < 0.1%
630458 | 1 | < 0.1%
626648 | 1 | < 0.1%

BILL_AMT2
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 22346
Distinct (%): 74.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49179.07517
Minimum: -69777
Maximum: 983931
Zeros: 2506
Zeros (%): 8.4%
Memory size: 117.2 KiB

Quantile statistics

Minimum: -69777
5th percentile: 0
Q1: 2984.75
Median: 21200
Q3: 64006.25
95th percentile: 194792.2
Maximum: 983931
Range: 1053708
Interquartile range (IQR): 61021.5

Descriptive statistics

Standard deviation: 71173.76878
Coefficient of variation (CV): 1.447236829
Kurtosis: 10.30294592
Mean: 49179.07517
Median Absolute Deviation (MAD): 20810
Skewness: 2.705220853
Sum: 1475372255
Variance: 5065705363
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 2506 | 8.4%
390 | 231 | 0.8%
326 | 75 | 0.2%
780 | 75 | 0.2%
316 | 72 | 0.2%
2500 | 51 | 0.2%
396 | 51 | 0.2%
2400 | 42 | 0.1%
-200 | 29 | 0.1%
416 | 28 | 0.1%
Other values (22336) | 26840 | 89.5%

Minimum 5 values
Value | Count | Frequency (%)
-69777 | 1 | < 0.1%
-67526 | 1 | < 0.1%
-33350 | 1 | < 0.1%
-30000 | 1 | < 0.1%
-26214 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
983931 | 1 | < 0.1%
743970 | 1 | < 0.1%
671563 | 1 | < 0.1%
646770 | 1 | < 0.1%
624475 | 1 | < 0.1%

BILL_AMT3
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 22026
Distinct (%): 73.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47013.1548
Minimum: -157264
Maximum: 1664089
Zeros: 2870
Zeros (%): 9.6%
Memory size: 117.2 KiB

Quantile statistics

Minimum: -157264
5th percentile: 0
Q1: 2666.25
Median: 20088.5
Q3: 60164.75
95th percentile: 187821.05
Maximum: 1664089
Range: 1821353
Interquartile range (IQR): 57498.5

Descriptive statistics

Standard deviation: 69349.38743
Coefficient of variation (CV): 1.475106015
Kurtosis: 19.78325514
Mean: 47013.1548
Median Absolute Deviation (MAD): 19708.5
Skewness: 3.087830046
Sum: 1410394644
Variance: 4809337537
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 2870 | 9.6%
390 | 275 | 0.9%
780 | 74 | 0.2%
326 | 63 | 0.2%
316 | 62 | 0.2%
396 | 48 | 0.2%
2500 | 40 | 0.1%
2400 | 39 | 0.1%
416 | 29 | 0.1%
200 | 27 | 0.1%
Other values (22016) | 26473 | 88.2%

Minimum 5 values
Value | Count | Frequency (%)
-157264 | 1 | < 0.1%
-61506 | 1 | < 0.1%
-46127 | 1 | < 0.1%
-34041 | 1 | < 0.1%
-25443 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
1664089 | 1 | < 0.1%
855086 | 1 | < 0.1%
693131 | 1 | < 0.1%
689643 | 1 | < 0.1%
689627 | 1 | < 0.1%

BILL_AMT4
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 21548
Distinct (%): 71.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43262.94897
Minimum: -170000
Maximum: 891586
Zeros: 3195
Zeros (%): 10.7%
Memory size: 117.2 KiB

Quantile statistics

Minimum: -170000
5th percentile: 0
Q1: 2326.75
Median: 19052
Q3: 54506
95th percentile: 174333.35
Maximum: 891586
Range: 1061586
Interquartile range (IQR): 52179.25

Descriptive statistics

Standard deviation: 64332.85613
Coefficient of variation (CV): 1.487019671
Kurtosis: 11.30932483
Mean: 43262.94897
Median Absolute Deviation (MAD): 18656
Skewness: 2.821965291
Sum: 1297888469
Variance: 4138716378
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 3195 | 10.7%
390 | 246 | 0.8%
780 | 101 | 0.3%
316 | 68 | 0.2%
326 | 62 | 0.2%
396 | 44 | 0.1%
150 | 39 | 0.1%
2400 | 39 | 0.1%
2500 | 34 | 0.1%
1000 | 33 | 0.1%
Other values (21538) | 26139 | 87.1%

Minimum 5 values
Value | Count | Frequency (%)
-170000 | 1 | < 0.1%
-81334 | 1 | < 0.1%
-65167 | 1 | < 0.1%
-50616 | 1 | < 0.1%
-46627 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
891586 | 1 | < 0.1%
706864 | 1 | < 0.1%
628699 | 1 | < 0.1%
616836 | 1 | < 0.1%
572805 | 1 | < 0.1%

BILL_AMT5
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 21010
Distinct (%): 70.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40311.40097
Minimum: -81334
Maximum: 927171
Zeros: 3506
Zeros (%): 11.7%
Memory size: 117.2 KiB

Quantile statistics

Minimum: -81334
5th percentile: 0
Q1: 1763
Median: 18104.5
Q3: 50190.5
95th percentile: 165794.3
Maximum: 927171
Range: 1008505
Interquartile range (IQR): 48427.5

Descriptive statistics

Standard deviation: 60797.15577
Coefficient of variation (CV): 1.508187617
Kurtosis: 12.30588129
Mean: 40311.40097
Median Absolute Deviation (MAD): 17688.5
Skewness: 2.876379867
Sum: 1209342029
Variance: 3696294150
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 3506 | 11.7%
390 | 235 | 0.8%
780 | 94 | 0.3%
316 | 79 | 0.3%
326 | 62 | 0.2%
150 | 58 | 0.2%
396 | 47 | 0.2%
2400 | 39 | 0.1%
2500 | 37 | 0.1%
416 | 36 | 0.1%
Other values (21000) | 25807 | 86.0%

Minimum 5 values
Value | Count | Frequency (%)
-81334 | 1 | < 0.1%
-61372 | 1 | < 0.1%
-53007 | 1 | < 0.1%
-46627 | 1 | < 0.1%
-37594 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
927171 | 1 | < 0.1%
823540 | 1 | < 0.1%
587067 | 1 | < 0.1%
551702 | 1 | < 0.1%
547880 | 1 | < 0.1%

BILL_AMT6
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 20604
Distinct (%): 68.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38871.7604
Minimum: -339603
Maximum: 961664
Zeros: 4020
Zeros (%): 13.4%
Memory size: 117.2 KiB

Quantile statistics

Minimum: -339603
5th percentile: 0
Q1: 1256
Median: 17071
Q3: 49198.25
95th percentile: 161912
Maximum: 961664
Range: 1301267
Interquartile range (IQR): 47942.25

Descriptive statistics

Standard deviation: 59554.10754
Coefficient of variation (CV): 1.53206613
Kurtosis: 12.27070529
Mean: 38871.7604
Median Absolute Deviation (MAD): 16755
Skewness: 2.846644576
Sum: 1166152812
Variance: 3546691724
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 4020 | 13.4%
390 | 207 | 0.7%
780 | 86 | 0.3%
150 | 78 | 0.3%
316 | 77 | 0.3%
326 | 56 | 0.2%
396 | 45 | 0.1%
416 | 36 | 0.1%
-18 | 33 | 0.1%
2400 | 32 | 0.1%
Other values (20594) | 25330 | 84.4%

Minimum 5 values
Value | Count | Frequency (%)
-339603 | 1 | < 0.1%
-209051 | 1 | < 0.1%
-150953 | 1 | < 0.1%
-94625 | 1 | < 0.1%
-73895 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
961664 | 1 | < 0.1%
699944 | 1 | < 0.1%
568638 | 1 | < 0.1%
527711 | 1 | < 0.1%
527566 | 1 | < 0.1%

PAY_AMT1
Real number (ℝ≥0)

ZEROS

Distinct: 7943
Distinct (%): 26.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5663.5805
Minimum: 0
Maximum: 873552
Zeros: 5249
Zeros (%): 17.5%
Memory size: 117.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1000
Median: 2100
Q3: 5006
95th percentile: 18428.2
Maximum: 873552
Range: 873552
Interquartile range (IQR): 4006

Descriptive statistics

Standard deviation: 16563.28035
Coefficient of variation (CV): 2.924524575
Kurtosis: 415.2547427
Mean: 5663.5805
Median Absolute Deviation (MAD): 1932
Skewness: 14.66836433
Sum: 169907415
Variance: 274342256.1
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 5249 | 17.5%
2000 | 1363 | 4.5%
3000 | 891 | 3.0%
5000 | 698 | 2.3%
1500 | 507 | 1.7%
4000 | 426 | 1.4%
10000 | 401 | 1.3%
1000 | 365 | 1.2%
2500 | 298 | 1.0%
6000 | 294 | 1.0%
Other values (7933) | 19508 | 65.0%

Minimum 5 values
Value | Count | Frequency (%)
0 | 5249 | 17.5%
1 | 9 | < 0.1%
2 | 14 | < 0.1%
3 | 15 | 0.1%
4 | 18 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
873552 | 1 | < 0.1%
505000 | 1 | < 0.1%
493358 | 1 | < 0.1%
423903 | 1 | < 0.1%
405016 | 1 | < 0.1%

PAY_AMT2
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 7899
Distinct (%): 26.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5921.1635
Minimum: 0
Maximum: 1684259
Zeros: 5396
Zeros (%): 18.0%
Memory size: 117.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 833
Median: 2009
Q3: 5000
95th percentile: 19004.35
Maximum: 1684259
Range: 1684259
Interquartile range (IQR): 4167

Descriptive statistics

Standard deviation: 23040.8704
Coefficient of variation (CV): 3.891274139
Kurtosis: 1641.631911
Mean: 5921.1635
Median Absolute Deviation (MAD): 1991
Skewness: 30.45381745
Sum: 177634905
Variance: 530881708.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 5396 | 18.0%
2000 | 1290 | 4.3%
3000 | 857 | 2.9%
5000 | 717 | 2.4%
1000 | 594 | 2.0%
1500 | 521 | 1.7%
4000 | 410 | 1.4%
10000 | 318 | 1.1%
6000 | 283 | 0.9%
2500 | 251 | 0.8%
Other values (7889) | 19363 | 64.5%

Minimum 5 values
Value | Count | Frequency (%)
0 | 5396 | 18.0%
1 | 15 | 0.1%
2 | 20 | 0.1%
3 | 18 | 0.1%
4 | 11 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
1684259 | 1 | < 0.1%
1227082 | 1 | < 0.1%
1215471 | 1 | < 0.1%
1024516 | 1 | < 0.1%
580464 | 1 | < 0.1%

PAY_AMT3
Real number (ℝ≥0)

ZEROS

Distinct: 7518
Distinct (%): 25.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5225.6815
Minimum: 0
Maximum: 896040
Zeros: 5968
Zeros (%): 19.9%
Memory size: 117.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 390
Median: 1800
Q3: 4505
95th percentile: 17589.4
Maximum: 896040
Range: 896040
Interquartile range (IQR): 4115

Descriptive statistics

Standard deviation: 17606.96147
Coefficient of variation (CV): 3.36931393
Kurtosis: 564.3112295
Mean: 5225.6815
Median Absolute Deviation (MAD): 1795
Skewness: 17.21663544
Sum: 156770445
Variance: 310005092.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 5968 | 19.9%
2000 | 1285 | 4.3%
1000 | 1103 | 3.7%
3000 | 870 | 2.9%
5000 | 721 | 2.4%
1500 | 490 | 1.6%
4000 | 381 | 1.3%
10000 | 312 | 1.0%
1200 | 243 | 0.8%
6000 | 241 | 0.8%
Other values (7508) | 18386 | 61.3%

Minimum 5 values
Value | Count | Frequency (%)
0 | 5968 | 19.9%
1 | 13 | < 0.1%
2 | 19 | 0.1%
3 | 14 | < 0.1%
4 | 15 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
896040 | 1 | < 0.1%
889043 | 1 | < 0.1%
508229 | 1 | < 0.1%
417588 | 1 | < 0.1%
400972 | 1 | < 0.1%

PAY_AMT4
Real number (ℝ≥0)

ZEROS

Distinct: 6937
Distinct (%): 23.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4826.076867
Minimum: 0
Maximum: 621000
Zeros: 6408
Zeros (%): 21.4%
Memory size: 117.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 296
Median: 1500
Q3: 4013.25
95th percentile: 16014.95
Maximum: 621000
Range: 621000
Interquartile range (IQR): 3717.25

Descriptive statistics

Standard deviation: 15666.15974
Coefficient of variation (CV): 3.246147995
Kurtosis: 277.3337677
Mean: 4826.076867
Median Absolute Deviation (MAD): 1500
Skewness: 12.90498482
Sum: 144782306
Variance: 245428561.1
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 6408 | 21.4%
1000 | 1394 | 4.6%
2000 | 1214 | 4.0%
3000 | 887 | 3.0%
5000 | 810 | 2.7%
1500 | 441 | 1.5%
4000 | 402 | 1.3%
10000 | 341 | 1.1%
2500 | 259 | 0.9%
500 | 258 | 0.9%
Other values (6927) | 17586 | 58.6%

Minimum 5 values
Value | Count | Frequency (%)
0 | 6408 | 21.4%
1 | 22 | 0.1%
2 | 22 | 0.1%
3 | 13 | < 0.1%
4 | 20 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
621000 | 1 | < 0.1%
528897 | 1 | < 0.1%
497000 | 1 | < 0.1%
432130 | 1 | < 0.1%
400046 | 1 | < 0.1%

PAY_AMT5
Real number (ℝ≥0)

ZEROS

Distinct: 6897
Distinct (%): 23.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4799.387633
Minimum: 0
Maximum: 426529
Zeros: 6703
Zeros (%): 22.3%
Memory size: 117.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 252.5
Median: 1500
Q3: 4031.5
95th percentile: 16000
Maximum: 426529
Range: 426529
Interquartile range (IQR): 3779

Descriptive statistics

Standard deviation: 15278.30568
Coefficient of variation (CV): 3.183386475
Kurtosis: 180.0639402
Mean: 4799.387633
Median Absolute Deviation (MAD): 1500
Skewness: 11.12741705
Sum: 143981629
Variance: 233426624.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 6703 | 22.3%
1000 | 1340 | 4.5%
2000 | 1323 | 4.4%
3000 | 947 | 3.2%
5000 | 814 | 2.7%
1500 | 426 | 1.4%
4000 | 401 | 1.3%
10000 | 343 | 1.1%
500 | 250 | 0.8%
6000 | 247 | 0.8%
Other values (6887) | 17206 | 57.4%

Minimum 5 values
Value | Count | Frequency (%)
0 | 6703 | 22.3%
1 | 21 | 0.1%
2 | 13 | < 0.1%
3 | 13 | < 0.1%
4 | 12 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
426529 | 1 | < 0.1%
417990 | 1 | < 0.1%
388071 | 1 | < 0.1%
379267 | 1 | < 0.1%
332000 | 1 | < 0.1%

PAY_AMT6
Real number (ℝ≥0)

ZEROS

Distinct: 6939
Distinct (%): 23.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5215.502567
Minimum: 0
Maximum: 528666
Zeros: 7173
Zeros (%): 23.9%
Memory size: 117.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 117.75
Median: 1500
Q3: 4000
95th percentile: 17343.8
Maximum: 528666
Range: 528666
Interquartile range (IQR): 3882.25

Descriptive statistics

Standard deviation: 17777.46578
Coefficient of variation (CV): 3.408581541
Kurtosis: 167.1614296
Mean: 5215.502567
Median Absolute Deviation (MAD): 1500
Skewness: 10.64072733
Sum: 156465077
Variance: 316038289.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 7173 | 23.9%
1000 | 1299 | 4.3%
2000 | 1295 | 4.3%
3000 | 914 | 3.0%
5000 | 808 | 2.7%
1500 | 439 | 1.5%
4000 | 411 | 1.4%
10000 | 356 | 1.2%
500 | 247 | 0.8%
6000 | 220 | 0.7%
Other values (6929) | 16838 | 56.1%

Minimum 5 values
Value | Count | Frequency (%)
0 | 7173 | 23.9%
1 | 20 | 0.1%
2 | 9 | < 0.1%
3 | 14 | < 0.1%
4 | 12 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
528666 | 1 | < 0.1%
527143 | 1 | < 0.1%
443001 | 1 | < 0.1%
422000 | 1 | < 0.1%
403500 | 1 | < 0.1%

DEFAULT
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.4 KiB

Value | Count | Frequency (%)
not default | 23364 | 77.9%
default | 6636 | 22.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 11
Median length: 11
Mean length: 10.1152
Min length: 7
Interactions

2020-12-07T12:03:18.167121image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:18.699066image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:19.348116image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:20.049118image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:20.626070image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:21.305067image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:21.839065image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:22.443067image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:23.142070image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:23.730067image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:24.232067image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:24.630067image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:25.339066image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:25.987069image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:26.580067image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:27.112117image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:27.789067image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:28.384069image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:28.975067image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:29.447121image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:30.082485image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:30.676233image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:31.144983image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:31.676241image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:32.304210image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:32.882283image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:33.429158image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:34.106766image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:34.700474image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:35.294223image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:35.872391image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:36.356771image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:36.934845image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:37.416654image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:38.385407image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:38.976450image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:39.617030image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:40.367074image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:40.945155image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:41.710776image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:42.268669image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:42.940543image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:43.456168image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:43.924915image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:44.388440image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:44.982138image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:45.560261image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:45.997760image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:46.466566image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:46.950889image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:47.560319image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:48.185267image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:48.732138image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:49.263387image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:49.872763image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:50.372770image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:50.966513image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:51.571087image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:52.006392image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:52.527496image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:53.081672image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:53.675423image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:54.287318image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:54.881110image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:55.521690image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:56.162313image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:56.974860image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:57.655493image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:58.311744image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:58.811747image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:03:59.483669image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:00.046118image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:00.749295image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:01.327374image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:02.050744image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:02.691365image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:03.332038image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:03.910115image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:04.708478image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:05.317846image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:05.849098image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:06.317895image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:06.958473image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:07.427224image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:08.067847image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:08.552220image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:08.896023image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:09.474144image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:09.989722image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:10.630344image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:11.161598image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:12.278961image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:12.685211image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:13.153962image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:13.825835image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:14.373698image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:14.873701image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:15.514373image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:16.029950image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:16.795623image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:17.404965image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:17.980020image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:18.692965image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:19.316964image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:19.967972image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:20.635965image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:21.270014image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:21.757965image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:22.457966image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:22.956018image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:23.395965image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:23.828965image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:24.286964image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:24.867964image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:25.467967image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:26.011963image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:26.427019image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:26.850014image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:27.436434image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:28.030183image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:28.452055image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:28.920806image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:29.327056image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:29.717679image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:30.108305image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:30.436430image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:30.748932image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:31.202056image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:31.842735image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:32.383491image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:33.028142image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:33.450012image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:33.934385image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:34.382384image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:34.913645image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:35.491814image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:36.023017image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/
2020-12-07T12:04:36.569884image/svg+xmlMatplotlib v3.3.1, https://matplotlib.org/

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a perfectly linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
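
The calculation described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the code the report itself runs; the function name `pearson_r` and the sample values are made up for the example.

```python
import numpy as np

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Population covariance and population standard deviations (ddof=0).
    # The normalisation cancels in the ratio, so ddof=1 throughout gives the same r.
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return cov / (x.std() * y.std())

# Hypothetical sample values, chosen only to illustrate the calculation.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.9])

r = pearson_r(x, y)
# Shifting and positively rescaling x leaves r unchanged.
r_rescaled = pearson_r(10 * x + 3, y)
```

The result agrees with `np.corrcoef(x, y)[0, 1]`, which is a convenient cross-check.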

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
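
As a rough illustration of the rank-based calculation above, the sketch below ranks both variables (ties get the average rank, a common convention assumed here) and then applies the ordinary correlation to the ranks. The helper names and the sample data are hypothetical.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Replace the ranks of each group of tied values by their average.
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    return np.corrcoef(average_ranks(x), average_ranks(y))[0, 1]

# Hypothetical data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

rho = spearman_rho(x, y)          # perfect monotonic association
r = np.corrcoef(x, y)[0, 1]       # Pearson's r is below 1 for the same data
```

Because the ranks of `x` and `y` are identical here, ρ reaches its maximum while Pearson's r does not, which is the difference the paragraph above describes.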

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
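
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows, with a hypothetical function name and made-up data. Libraries such as SciPy default to the tau-b variant, which adjusts the denominator for ties, so results can differ when the data contain tied values.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, as described above."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction,
        # discordant when they move in opposite directions; ties count as neither.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Hypothetical data: 10 pairs in total, 7 concordant and 3 discordant.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
tau = kendall_tau_a(x, y)  # (7 - 3) / 10
```

The double loop is quadratic in the number of observations, which is fine for illustration but slow for a dataset of this report's size.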

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
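
A minimal sketch of a bias-corrected Cramér's V, following the correction commonly attributed to Bergsma (2013), is shown below. It is an assumption that this matches the report's implementation in every detail; the function name and the contingency tables are hypothetical, and the code assumes every row and column total is non-zero.

```python
import numpy as np

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k contingency table of counts."""
    observed = np.asarray(table, dtype=float)
    n = observed.sum()
    r, k = observed.shape
    # Expected counts under independence, from the row and column totals.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    chi2 = ((observed - expected) ** 2 / expected).sum()
    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return float(np.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1)))

# Hypothetical tables: one with independent variables, one with perfect association.
v_independent = cramers_v_corrected([[10, 20], [20, 40]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
```

The two tables hit the ends of the range: proportional rows give 0, and a diagonal table gives 1.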

Missing values


Sample

First rows

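row | df_index | ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | DEFAULT
0 | 1 | 1 | 20000 | female | university | 1 | 24 | 2 | 2 | -1 | -1 | -2 | -2 | 3913 | 3102 | 689 | 0 | 0 | 0 | 0 | 689 | 0 | 0 | 0 | 0 | default
1 | 2 | 2 | 120000 | female | university | 2 | 26 | -1 | 2 | 0 | 0 | 0 | 2 | 2682 | 1725 | 2682 | 3272 | 3455 | 3261 | 0 | 1000 | 1000 | 1000 | 0 | 2000 | default
2 | 3 | 3 | 90000 | female | university | 2 | 34 | 0 | 0 | 0 | 0 | 0 | 0 | 29239 | 14027 | 13559 | 14331 | 14948 | 15549 | 1518 | 1500 | 1000 | 1000 | 1000 | 5000 | not default
3 | 4 | 4 | 50000 | female | university | 1 | 37 | 0 | 0 | 0 | 0 | 0 | 0 | 46990 | 48233 | 49291 | 28314 | 28959 | 29547 | 2000 | 2019 | 1200 | 1100 | 1069 | 1000 | not default
4 | 5 | 5 | 50000 | male | university | 1 | 57 | -1 | 0 | -1 | 0 | 0 | 0 | 8617 | 5670 | 35835 | 20940 | 19146 | 19131 | 2000 | 36681 | 10000 | 9000 | 689 | 679 | not default
5 | 6 | 6 | 50000 | male | graduate school | 2 | 37 | 0 | 0 | 0 | 0 | 0 | 0 | 64400 | 57069 | 57608 | 19394 | 19619 | 20024 | 2500 | 1815 | 657 | 1000 | 1000 | 800 | not default
6 | 7 | 7 | 500000 | male | graduate school | 2 | 29 | 0 | 0 | 0 | 0 | 0 | 0 | 367965 | 412023 | 445007 | 542653 | 483003 | 473944 | 55000 | 40000 | 38000 | 20239 | 13750 | 13770 | not default
7 | 8 | 8 | 100000 | female | university | 2 | 23 | 0 | -1 | -1 | 0 | 0 | -1 | 11876 | 380 | 601 | 221 | -159 | 567 | 380 | 601 | 0 | 581 | 1687 | 1542 | not default
8 | 9 | 9 | 140000 | female | high school | 1 | 28 | 0 | 0 | 2 | 0 | 0 | 0 | 11285 | 14096 | 12108 | 12211 | 11793 | 3719 | 3329 | 0 | 432 | 1000 | 1000 | 1000 | not default
9 | 10 | 10 | 20000 | male | high school | 2 | 35 | -2 | -2 | -2 | -2 | -1 | -1 | 0 | 0 | 0 | 0 | 13007 | 13912 | 0 | 0 | 0 | 13007 | 1122 | 0 | not default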

Last rows

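row | df_index | ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | DEFAULT
29990 | 30194 | 29991 | 140000 | male | university | 1 | 41 | 0 | 0 | 0 | 0 | 0 | 0 | 138325 | 137142 | 139110 | 138262 | 49675 | 46121 | 6000 | 7000 | 4228 | 1505 | 2000 | 2000 | not default
29991 | 30195 | 29992 | 210000 | male | university | 1 | 34 | 3 | 2 | 2 | 2 | 2 | 2 | 2500 | 2500 | 2500 | 2500 | 2500 | 2500 | 0 | 0 | 0 | 0 | 0 | 0 | default
29992 | 30196 | 29993 | 10000 | male | high school | 1 | 43 | 0 | 0 | 0 | -2 | -2 | -2 | 8802 | 10400 | 0 | 0 | 0 | 0 | 2000 | 0 | 0 | 0 | 0 | 0 | not default
29993 | 30197 | 29994 | 100000 | male | graduate school | 2 | 38 | 0 | -1 | -1 | 0 | 0 | 0 | 3042 | 1427 | 102996 | 70626 | 69473 | 55004 | 2000 | 111784 | 4000 | 3000 | 2000 | 2000 | not default
29994 | 30198 | 29995 | 80000 | male | university | 2 | 34 | 2 | 2 | 2 | 2 | 2 | 2 | 72557 | 77708 | 79384 | 77519 | 82607 | 81158 | 7000 | 3500 | 0 | 7000 | 0 | 4000 | default
29995 | 30199 | 29996 | 220000 | male | high school | 1 | 39 | 0 | 0 | 0 | 0 | 0 | 0 | 188948 | 192815 | 208365 | 88004 | 31237 | 15980 | 8500 | 20000 | 5003 | 3047 | 5000 | 1000 | not default
29996 | 30200 | 29997 | 150000 | male | high school | 2 | 43 | -1 | -1 | -1 | -1 | 0 | 0 | 1683 | 1828 | 3502 | 8979 | 5190 | 0 | 1837 | 3526 | 8998 | 129 | 0 | 0 | not default
29997 | 30201 | 29998 | 30000 | male | university | 2 | 37 | 4 | 3 | 2 | -1 | 0 | 0 | 3565 | 3356 | 2758 | 20878 | 20582 | 19357 | 0 | 0 | 22000 | 4200 | 2000 | 3100 | default
29998 | 30202 | 29999 | 80000 | male | high school | 1 | 41 | 1 | -1 | 0 | 0 | 0 | -1 | -1645 | 78379 | 76304 | 52774 | 11855 | 48944 | 85900 | 3409 | 1178 | 1926 | 52964 | 1804 | default
29999 | 30203 | 30000 | 50000 | male | university | 1 | 46 | 0 | 0 | 0 | 0 | 0 | 0 | 47929 | 48905 | 49764 | 36535 | 32428 | 15313 | 2078 | 1800 | 1430 | 1000 | 1000 | 1000 | default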